optimized iterative de bruijn graph assembly pipeline (SourceForge net)
Structured Review

Optimized Iterative De Bruijn Graph Assembly Pipeline, supplied by SourceForge net, used in various techniques. Bioz Stars score: 90/100, based on 1 PubMed citations. ZERO BIAS - scores, article reviews, protocol conditions and more
https://www.bioz.com/result/optimized iterative de bruijn graph assembly pipeline/product/SourceForge net
Average 90 stars, based on 1 article reviews
Images
1) Product Images from "Conservation of Gene Cassettes among Diverse Viruses of the Human Gut"
Article Title: Conservation of Gene Cassettes among Diverse Viruses of the Human Gut
Journal: PLoS ONE
doi: 10.1371/journal.pone.0042342
Figure Legend Snippet: A) Shotgun sequences are produced from two different genomes (shown in blue and red at the top). Those sequences are used to construct a de Bruijn graph, where nodes are formed by all possible sequences of length k-1 (in this case 4 bases), which are connected by edges of length k (5 bases). Since there are no 4mers shared between these two example genomes, the resulting de Bruijn subgraphs are separate. B) Nucleotide polymorphisms are better resolved by short kmers. We consider a mixture of four genomes, each with three polymorphic positions separated by 25 bp. The identity at each polymorphic position is represented by either blue or red to indicate different nucleotides. At all other positions the genomes are identical. The de Bruijn graph that is constructed from this mixture of genomes using a kmer of 23 is shown on the left, where three independent bubbles form around each polymorphic position. The equivalent graph at k = 27 is shown on the right, where three independent sets of bubbles overlap, forming a more complex and suboptimal graph structure. C) Short regions of similarity are better resolved by long kmers. We consider a mixture of two genomes which are entirely different except for a 25 bp region of sequence identity (shown in black). The de Bruijn graph that is constructed from this mixture at k = 23 is shown on the left, where the two resulting subgraphs intersect at the 23mer of similarity. The de Bruijn graph at k = 27 is shown on the right, where the two resulting subgraphs (corresponding to the two genomes) do not intersect, since they have no 26mer in common. The examples in B and C together illustrate how different kmers can be optimal for assembling graphs with different types of polymorphisms.
Techniques Used: Produced, Construct, Sequencing